528 research outputs found

    VGAN-Based Image Representation Learning for Privacy-Preserving Facial Expression Recognition

    Reliable facial expression recognition plays a critical role in human-machine interactions. However, most of the facial expression analysis methodologies proposed to date pay little or no attention to the protection of a user's privacy. In this paper, we propose a Privacy-Preserving Representation-Learning Variational Generative Adversarial Network (PPRL-VGAN) to learn an image representation that is explicitly disentangled from the identity information. At the same time, this representation is discriminative from the standpoint of facial expression recognition and generative, as it allows expression-equivalent face image synthesis. We evaluate the proposed model on two public datasets under various threat scenarios. Quantitative and qualitative results demonstrate that our approach strikes a balance between the preservation of privacy and data utility. We further demonstrate that our model can be effectively applied to other tasks such as expression morphing and image completion.
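
    The abstract describes a discriminator that simultaneously judges real vs. fake, identity, and expression, coupled to a variational encoder and a generator. Below is a minimal PyTorch-style sketch of how such a training objective could be laid out; the module interfaces, the unit loss weights, and the target-identity conditioning are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch (not the authors' code): one possible loss layout for a
# PPRL-VGAN-style model. Module interfaces and loss weights are assumptions.
import torch
import torch.nn.functional as F

def pprl_vgan_losses(enc, gen, disc, x, id_label, expr_label, target_id):
    """enc(x) -> (mu, logvar): identity-free latent code (VAE-style)
    gen(z, target_id) -> synthetic face carrying a replacement identity
    disc(x) -> (real/fake logit, identity logits, expression logits)"""
    mu, logvar = enc(x)
    z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
    x_fake = gen(z, target_id)

    rf_real, id_real, expr_real = disc(x)
    rf_fake_d, _, _ = disc(x_fake.detach())        # detached for the D update
    rf_fake_g, id_fake, expr_fake = disc(x_fake)   # attached for the G update

    # Discriminator: real/fake + identity + expression classification.
    d_loss = (F.binary_cross_entropy_with_logits(rf_real, torch.ones_like(rf_real))
              + F.binary_cross_entropy_with_logits(rf_fake_d, torch.zeros_like(rf_fake_d))
              + F.cross_entropy(id_real, id_label)
              + F.cross_entropy(expr_real, expr_label))

    # Encoder/generator: fool the real/fake head, match the *target* identity
    # (disentangling the code from the true identity), preserve the expression,
    # and keep the latent close to a unit Gaussian.
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    g_loss = (F.binary_cross_entropy_with_logits(rf_fake_g, torch.ones_like(rf_fake_g))
              + F.cross_entropy(id_fake, target_id)
              + F.cross_entropy(expr_fake, expr_label)
              + kl)
    return d_loss, g_loss
```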

    BSUV-Net: a fully-convolutional neural network for background subtraction of unseen videos

    Background subtraction is a basic task in computer vision and video processing, often applied as a pre-processing step for object tracking, people recognition, etc. Recently, a number of successful background-subtraction algorithms have been proposed; however, nearly all of the top-performing ones are supervised. Crucially, their success relies upon the availability of some annotated frames of the test video during training. Consequently, their performance on completely “unseen” videos is undocumented in the literature. In this work, we propose a new, supervised, background-subtraction algorithm for unseen videos (BSUV-Net) based on a fully-convolutional neural network. The input to our network consists of the current frame and two background frames captured at different time scales, along with their semantic segmentation maps. In order to reduce the chance of overfitting, we also introduce a new data-augmentation technique which mitigates the impact of illumination difference between the background frames and the current frame. On the CDNet-2014 dataset, BSUV-Net outperforms state-of-the-art algorithms evaluated on unseen videos in terms of several metrics, including F-measure, recall and precision.
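
    The abstract pins down the network input (current frame, two background frames captured at different time scales, and their semantic segmentation maps), so assembling that input tensor can be sketched directly. The channel order, the single-channel foreground-probability form of the segmentation maps, and the helper name build_input are assumptions rather than the released code.

```python
# Hedged sketch: assembling a BSUV-Net-style input for a fully-convolutional
# network. Channel layout and helper name are assumptions from the abstract.
import numpy as np

def build_input(curr_rgb, bg_recent_rgb, bg_distant_rgb,
                curr_seg, bg_recent_seg, bg_distant_seg):
    """RGB frames: (H, W, 3) float32 in [0, 1].
    Segmentation maps: (H, W) foreground probabilities in [0, 1].
    Returns an (H, W, 12) array: two background frames at different
    time scales plus the current frame, each with its segmentation map."""
    planes = [
        bg_distant_rgb, bg_distant_seg[..., None],  # long-term background
        bg_recent_rgb,  bg_recent_seg[..., None],   # short-term background
        curr_rgb,       curr_seg[..., None],        # frame to be segmented
    ]
    return np.concatenate(planes, axis=-1).astype(np.float32)
```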

    Spatio-Visual Fusion-Based Person Re-Identification for Overhead Fisheye Images

    Person re-identification (PRID) has been thoroughly researched in typical surveillance scenarios where various scenes are monitored by side-mounted, rectilinear-lens cameras. To date, few methods have been proposed for fisheye cameras mounted overhead, and their performance is lacking. In order to close this performance gap, we propose a multi-feature framework for fisheye PRID where we combine deep-learning, color-based and location-based features by means of novel feature fusion. We evaluate the performance of our framework for various feature combinations on FRIDA, a public fisheye PRID dataset. The results demonstrate that our multi-feature approach outperforms recent appearance-based deep-learning methods by almost 18 percentage points and location-based methods by almost 3 percentage points in matching accuracy. We also demonstrate the potential application of the proposed PRID framework to people counting in large, crowded indoor spaces.
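
    As a rough illustration of multi-feature fusion, the sketch below combines deep-appearance, color, and location cues at the score level. The weights and the particular similarity measures are illustrative assumptions; the abstract describes the fusion only as "novel feature fusion", so the paper's actual rule may differ substantially.

```python
# Hedged sketch: score-level fusion of deep, color, and location cues for
# fisheye person re-identification. Weights and measures are assumptions.
import numpy as np

def fused_similarity(deep_a, deep_b, color_a, color_b, loc_a, loc_b,
                     w_deep=0.5, w_color=0.3, w_loc=0.2):
    """deep_*: embedding vectors; color_*: L1-normalized histograms;
    loc_*: 2-D floor-plane positions. Returns one matching score."""
    s_deep = float(deep_a @ deep_b /
                   (np.linalg.norm(deep_a) * np.linalg.norm(deep_b)))  # cosine
    s_color = 1.0 - 0.5 * np.abs(color_a - color_b).sum()  # histogram overlap
    s_loc = float(np.exp(-np.linalg.norm(loc_a - loc_b)))  # spatial proximity
    return w_deep * s_deep + w_color * s_color + w_loc * s_loc
```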

    Action Recognition in Video by Covariance Matching of Silhouette Tunnels

    Action recognition is a challenging problem in video analytics due to event complexity, variations in imaging conditions, and intra- and inter-individual action-variability. Central to these challenges is the way one models actions in video, i.e., action representation. In this paper, an action is viewed as a temporal sequence of local shape-deformations of centroid-centered object silhouettes, i.e., the shape of the centroid-centered object silhouette tunnel. Each action is represented by the empirical covariance matrix of a set of 13-dimensional normalized geometric feature vectors that capture the shape of the silhouette tunnel. The similarity of two actions is measured in terms of a Riemannian metric between their covariance matrices. The silhouette tunnel of a test video is broken into short overlapping segments and each segment is classified using a dictionary of labeled action covariance matrices and the nearest-neighbor rule. On a database of 90 short video sequences this attains a correct classification rate of 97%, which is very close to the state-of-the-art, at almost 5-fold reduced computational cost. Majority-vote fusion of segment decisions achieves a 100% classification rate. Keywords: video analysis; action recognition; silhouette tunnel; covariance matching; generalized eigenvalues.
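
    The keywords point at a generalized-eigenvalue computation of the Riemannian metric between covariance matrices. A minimal sketch of that matching step, assuming the standard affine-invariant distance d(C1, C2) = sqrt(sum_i ln^2 lambda_i) over the generalized eigenvalues lambda_i of the pencil (C1, C2), is given below; the dictionary layout is a hypothetical convenience.

```python
# Hedged sketch of covariance matching with the affine-invariant Riemannian
# metric; the nearest-neighbor dictionary layout is a hypothetical convenience.
import numpy as np
from scipy.linalg import eigh

def covariance_distance(C1, C2):
    """d(C1, C2) = sqrt(sum_i ln^2 lambda_i), where lambda_i are the
    generalized eigenvalues of C1 v = lambda C2 v. C1, C2 are (13, 13)
    SPD covariance matrices of the silhouette-tunnel feature vectors."""
    lam = eigh(C1, C2, eigvals_only=True)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

def classify_segment(test_cov, dictionary):
    """Nearest-neighbor rule over (label, covariance) pairs."""
    label, _ = min(dictionary,
                   key=lambda item: covariance_distance(test_cov, item[1]))
    return label
```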

    Increased transcription in hydroxyurea-treated root meristem cells of Vicia faba

    Hydroxyurea (HU), an inhibitor of ribonucleotide reductase, prevents cells from progressing through S phase by depleting deoxyribonucleoside triphosphates. Concurrently, disruption of DNA replication leads to double-strand DNA breaks. In root meristems of Vicia faba, HU triggers cell-cycle arrest (preferentially in G1/S) and alters overall metabolism through global activation of transcription in both the nucleoplasmic and nucleolar regions. The high level of transcription is accompanied by an increase in the content of the RNA polymerase II large subunit (POLR2A). Changes in transcription activation and POLR2A content correlate with posttranslational modifications of histones that play a role in opening up chromatin for transcription. An increase in the level of H4 Lys5 acetylation indicates that the global activation of transcription following HU treatment depends on histone modifications.

    Behavior subtraction

    Background subtraction has been a driving engine for many computer vision and video analytics tasks. Although many variants exist, they all share the underlying assumption that photometric scene properties are either static or exhibit temporal stationarity. While this works in many applications, the model fails when one is interested in discovering changes in scene dynamics instead of changes in the scene's photometric properties; detecting unusual pedestrian patterns and detecting unusual motor-traffic patterns are but two examples. We propose a new model and computational framework that assume the dynamics of a scene, not its photometry, to be stationary, i.e., a dynamic background serves as the reference for the dynamics of an observed scene. Central to our approach is the concept of an event, which we define as short-term scene dynamics captured over a time window at a specific spatial location in the camera's field of view. Unlike in our earlier work, we compute events by time-aggregating vector object descriptors that can combine multiple features, such as object size, direction of movement, speed, etc. We characterize events probabilistically, but use low-memory, low-complexity surrogates in a practical implementation. Using these surrogates amounts to behavior subtraction, a new algorithm for effective and efficient temporal anomaly detection and localization. Behavior subtraction is resilient to spurious background motion, such as that due to camera jitter, and is content-blind, i.e., it works equally well on humans, cars, animals, and other objects in both uncluttered and highly cluttered scenes. Clearly, treating video as a collection of events rather than colored pixels opens new possibilities for video analytics.
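
    To make the event notion concrete, the sketch below time-aggregates per-pixel object descriptors over a short window and flags locations whose aggregated dynamics deviate from a reference event map. The descriptor layout, the L2 deviation measure, and the fixed threshold are illustrative assumptions standing in for the probabilistic event characterization and low-complexity surrogates the abstract mentions.

```python
# Hedged sketch: behavior-subtraction-style anomaly detection. Descriptor
# contents, the deviation measure, and the threshold are assumptions.
import numpy as np

def behavior_subtraction(descriptors, background_events, window=30, thresh=2.5):
    """descriptors: (T, H, W, D) per-frame object-descriptor maps (e.g. size,
    direction of movement, speed), zero where no object is present.
    background_events: (H, W, D) reference event map aggregated over the same
    window length during a training period. Returns a boolean anomaly map."""
    event = descriptors[-window:].sum(axis=0)        # short-term aggregation
    deviation = np.linalg.norm(event - background_events, axis=-1)
    return deviation > thresh                        # dynamics differ from reference
```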